Abstract
Finding ethical, platform-independent, computationally efficient methods of adding contextual information to the hate speech detection task is difficult. Methods that rely only on the text for successful classification are therefore of great importance. Emotion information extracted from text has been shown to be effective for sentiment analysis, and we hypothesize that it could also benefit hate speech detection. In this study, we propose several methods of introducing emotions into the task of hate speech detection. Using an emotion lexicon, we counter-fitted pre-trained word embeddings (Word2Vec, GloVe, FastText) and also generated a binary and a weighted emotional embedding vector. These were used as features for classification on four publicly available hate speech datasets. Our results and analysis demonstrate that the inclusion of emotion information, especially anger, sadness, disgust, and fear, is helpful for hate speech detection.
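The binary and weighted emotional embedding vectors mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the vectors are built, so the eight emotion categories, the toy lexicon `TOY_LEXICON`, its intensity values, and the choice to average intensities over the token count are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed emotion dimensions (eight Plutchik-style categories); the abstract
# names only anger, sadness, disgust, and fear explicitly.
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]

# Hypothetical toy lexicon: word -> {emotion: intensity in [0, 1]}.
# A real emotion lexicon would replace this; the values are made up.
TOY_LEXICON: Dict[str, Dict[str, float]] = {
    "hate": {"anger": 0.8, "disgust": 0.7, "fear": 0.5, "sadness": 0.6},
    "filthy": {"disgust": 0.9},
    "love": {"joy": 0.8, "trust": 0.6},
}


def binary_emotion_vector(tokens: List[str],
                          lexicon: Dict[str, Dict[str, float]] = TOY_LEXICON) -> List[int]:
    """Return 1 for each emotion that any token is associated with, else 0."""
    vec = [0] * len(EMOTIONS)
    for tok in tokens:
        for emo in lexicon.get(tok.lower(), {}):
            vec[EMOTIONS.index(emo)] = 1
    return vec


def weighted_emotion_vector(tokens: List[str],
                            lexicon: Dict[str, Dict[str, float]] = TOY_LEXICON) -> List[float]:
    """Sum the lexicon intensities per emotion, then divide by the token count."""
    vec = [0.0] * len(EMOTIONS)
    for tok in tokens:
        for emo, weight in lexicon.get(tok.lower(), {}).items():
            vec[EMOTIONS.index(emo)] += weight
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    return [v / n for v in vec]


if __name__ == "__main__":
    tokens = "I hate filthy".split()
    print(binary_emotion_vector(tokens))
    print(weighted_emotion_vector(tokens))
```

Either vector could then be concatenated with a text representation and passed to a classifier; whether concatenation matches the paper's setup is likewise an assumption.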
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Madukwe, K.J., Gao, X., Xue, B. (2021). What Emotion Is Hate? Incorporating Emotion Information into the Hate Speech Detection Task. In: Pham, D.N., Theeramunkong, T., Governatori, G., Liu, F. (eds) PRICAI 2021: Trends in Artificial Intelligence. PRICAI 2021. Lecture Notes in Computer Science, vol 13032. Springer, Cham. https://doi.org/10.1007/978-3-030-89363-7_21
DOI: https://doi.org/10.1007/978-3-030-89363-7_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89362-0
Online ISBN: 978-3-030-89363-7
eBook Packages: Computer Science, Computer Science (R0)